Material for "Combinatorial Multi-Armed Bandit: General Framework, Results and Applications"

Authors

  • Wei Chen
  • Yajun Wang
Abstract

We use the following two well-known bounds in our proofs.

Lemma 1 (Chernoff-Hoeffding bound). Let $X_1, \dots, X_n$ be random variables with common support $[0, 1]$ and $E[X_i] = \mu$. Let $S_n = X_1 + \cdots + X_n$. Then for all $t \ge 0$, $\Pr[S_n \ge n\mu + t] \le e^{-2t^2/n}$ and $\Pr[S_n \le n\mu - t] \le e^{-2t^2/n}$.

Lemma 2 (Bernstein inequality). Let $X_1, \dots, X_n$ be independent zero-mean random variables. If for all $1 \le i \le n$, $|X_i| \le k$, then for all $t > 0$,
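As a numerical illustration of Lemma 1, the sketch below compares a simulated upper-tail probability with the bound $e^{-2t^2/n}$. It assumes i.i.d. Bernoulli($\mu$) samples, which is only one special case of variables supported on $[0, 1]$; the function names `hoeffding_bound` and `empirical_upper_tail` and the parameter values are illustrative choices, not taken from the paper.

```python
import math
import random


def hoeffding_bound(n: int, t: float) -> float:
    """Right-hand side of Lemma 1: exp(-2 t^2 / n)."""
    return math.exp(-2.0 * t * t / n)


def empirical_upper_tail(n: int, mu: float, t: float,
                         trials: int, seed: int = 0) -> float:
    """Estimate Pr[S_n >= n*mu + t] for i.i.d. Bernoulli(mu) samples.

    Assumption: Bernoulli samples are used purely for illustration.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # S_n is the sum of n independent Bernoulli(mu) draws.
        s_n = sum(1 for _ in range(n) if rng.random() < mu)
        if s_n >= n * mu + t:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    n, mu, t = 100, 0.5, 10.0
    print("empirical tail: ", empirical_upper_tail(n, mu, t, trials=2000))
    print("Hoeffding bound:", hoeffding_bound(n, t))
```

The simulated frequency should fall below the bound, typically by a wide margin, since the Chernoff-Hoeffding inequality holds for every distribution on $[0, 1]$ and is therefore not tight for any particular one.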

Related resources

Improving Regret Bounds for Combinatorial Semi-Bandits with Probabilistically Triggered Arms and Its Applications

We study combinatorial multi-armed bandit with probabilistically triggered arms and semi-bandit feedback (CMAB-T). We resolve a serious issue in the prior CMAB-T studies where the regret bounds contain a possibly exponentially large factor of 1/p∗, where p∗ is the minimum positive probability that an arm is triggered by any action. We address this issue by introducing a triggering probability m...

Tighter Regret Bounds for Influence Maximization and Other Combinatorial Semi-Bandits with Probabilistically Triggered Arms

We study combinatorial multi-armed bandit with probabilistically triggered arms and semi-bandit feedback (CMAB-T). We resolve a serious issue in the prior CMAB-T studies where the regret bounds contain a possibly exponentially large factor of 1/p, where p is the minimum positive probability that an arm is triggered by any action. We address this issue by introducing a triggering probability mod...

Combinatorial Multi-Armed Bandit with General Reward Functions

In this paper, we study the stochastic combinatorial multi-armed bandit (CMAB) framework that allows a general nonlinear reward function, whose expected value may not depend only on the means of the input random variables but possibly on the entire distributions of these variables. Our framework enables a much larger class of reward functions such as the max() function and nonlinear utility fun...

Efficient Ordered Combinatorial Semi-Bandits for Whole-Page Recommendation

The Multi-Armed Bandit (MAB) framework has been successfully applied in many web applications. However, many complex real-world applications that involve multiple content recommendations cannot fit into the traditional MAB setting. To address this issue, we consider an ordered combinatorial semi-bandit problem where the learner recommends S actions from a base set of K actions, and displays the res...

Combinatorial Multi-Armed Bandit: General Framework, Results and Applications

We define a general framework for a large class of combinatorial multi-armed bandit (CMAB) problems, where simple arms with unknown distributions form super arms. In each round, a super arm is played and the outcomes of its related simple arms are observed, which helps the selection of super arms in future rounds. The reward of the super arm depends on the outcomes of played arms, and it only n...

Publication date: 2013